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P. A. W. Lewis, Naval Postgraduate School , Monterr*y, SA., '...A, 
E. McKenzie, University of Strathclyde, Glasgow, Scotirind, U.K. 
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Abstract 



The Beta~Gamma transf orrnat ion is described and is used I - define j v^ry 
simple first-order autor egr ess i ve Beta-Gamma process, BGAhl'li. Maximum 
likelihood estimation is discussed for this model, as w(-ll .'i mcment 
estimators. The first-order structure is extended to include movir.g averagf- 
processes and mixed first-order au t or egr essi ve , pth-order movinr tverage 
processes. It is shown that these Gamma processes are t irne-rf ver s i i . and, 
therefore, too narrow for general physical modelling. A dual procc'ss to tri^^ 
BGAR(l) process, DBGAR(l), is introduced, as well as an it<^ratfd prc -ers 
which combines the Beta-Gamma process and the GAR( 1 ) pr^oct-ss of 'Taver an 1 
Lewis (i960). Some properties of these extended autoregress i ve processes are 
derived. Several highly nonlinear extensions of these processes which 
produce negative correlation are given. Use of the processes to model a 
sequence of times between failures of a computer system is describf-d. 

0. INTRODUCTION 



The Gamma distribution is used to model a wide variety of positive 
valued random quantities in fields such as operations analysis, reliability 
theory, hydrology and meteorology. Thus, service time distributions and 
interarrival times in queues are often modelled as having Gamma distribu- 
tions, as are wind velocities (Hugus, 1982; Brown, Katz and Mur'phy, 
measured at successive discrete time points or river flows at su^cessiv*- 
instants of time (Lawrance and Kottegoda, 1977). In all these cases, the 
measurements are taken serially in time and are apt to be sei'ially 
dependent. Thus, development of time series with Gamma distribute 1 marginal 
distributions and various correlation structures is of great imparl 

Gaver and Lewis (1980) showed that the usual linear first-order 
autor egr ess i ve equation, = pX^_^+E^, would yield Gamma rnarginil 
distributions if the i.i.d. sequence {E^} was chosen with suitable marginal 
distribution. This Gamma innovation distribution has a positive pr'cl abi 1 it, v 
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of being zero, so that the process (GAR(l)) generates sample paths which 
exhibit Vuns^down’ (as seen in river flow data), but which are "defective”. 
The defect lies in the fact that when = 0, and are proportional 
and p can be estimated exactly in long enough time series. Moreover, the 
probability of the defect is higher for k small, which is precisely where 
the model is needed, since the usual techniques of transforming to normality 
are then questionable and probably undesirable. 

Bernier (1970) introduced the GAR( 1 ) model in a hydrological context 
and McKenzie (1982) introduced a multiplicative Gamma process called PAR(l) 
~ product autor egress ion of order one. 

In Lewis (1983) and Hugus (1982), a simple linear, random coefficient 
model called BGAR(l) was introduced. It is based on the Beta-Gamma 
transformati on described in Section 1. 

The purpose of this paper is to develop the properties of this BGAR(l) 
model and to extend the idea to moving average and mixed autoregressi ve 
structures. In particular, it is shown that these processes, like the 
Gaussian ARMA(p,q) models, are time reversible and therefore are very 
part icular . 

Several schemes for broadening the structure of Gamma time series are 
given. In particular, a technique of iteration produces a Gamma autoregres- 
sive process with two structional parameters that can model, for given 
marginal distribution and serial correlation, different kinds of sample path 
behavior. Some nonlinear schemes that produce negative serial correlation 
are also introduced. 

It is important to note the multiplicity of Gamma processes which can 
be derived with given first- and second-order structure. Conseqently, in 
the absence of a ’natural* structure such as exists for Gaussian processes, 
our aim has been to produce simple structures, i.e., linear, additive, 
random coefficient processes. 

Finally, a series of times between failures of digital computers 
(Lewis, 1964) is analyzed and fitted with the model. The data is serially 
correlated with a marginal distribution which is more skewed than an 
exponential distribution. Although this data is known to be generated by a 
branching Poisson process (cluster process), the Gamma model is much simpler 
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and much more tractable than the cluster process, and provid<-s an adeq<iate 
representat ion for most purposes. 



1 « PRELIMINARIES 

In what follows we will use B(m,n), or simply B when the paramot er i 
tion is clear, to stand for a Beta random variable with parameters m > 0 and 
n > 0, denoted by Beta(m,n). The probability density function for a 
Beta(m,n) random variable is 

fg(x;m,n) = Y"{m)rin) ^ (1“^) » 0 ^ x ^ 1 ; m > 0 ; n^ 0 , M.l) 

where r(0 is the complete Gamma function. 

We will denote by {B(m,n), B ^ ( m *, n ’),••• 1 an i.i.d. sequence of vector 
random variables whose components are independent Beta random variables. 

Let G(k,6) stand for a Gamma random variable with shape parameter 
k > 0, and rate parameter 6 > 0, denoted by Gamma(k,B). The probability 
density function for a Gamma(k,6) random variable is 

k k^l -6x 

f^(x;k,6) = — . X > 0: 6 > 0; k > 0. (1.2) 

We will denote by {G^(k,6)}, an i.i.d. sequence of Gamma variates. 

A Gamma(k,6) random variable has moments 

? “-I /2 

E(G) = k/B = u; Var(G) = k/B ; C(G) = s.d(G/p) = k , (1.3) 

where C(G) is the coefficient of variation, and Laplace-^Stieltjes transform 

Lg<“) ■ ■ (rTTr)"' 

The Gamma variable is sometimes par ameter i zed in terms of the parameter 
p = E(G). This is useful in statistical work, since the mean is a multi^ 
plicative parameter and G can be written as a unit^mean Gamma variate, G* , 
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times u, i.e., G = pG* . However, in what follows, we will use the fact that 
two independent Gamma variates with the same 6*"parameters , but possibly 
different shape parameters, add to give another Gamma variate 

G”(k^k»,6) = G(k,6) ^ G'(k*,6). (1.5) 



The result is not true if the Gamma variates have the same mean but 
different shape parameters. 

The Gamma family of random variables include the Exponen t i al { k =1 ) , 
Erlang(k integer) and Chi^Square(k=r/2 , r=1,2,--*; 6=2) random variables. 

Gamma and Beta variates are intimately related and two of their 
properties will be used throughout this paper. 

(i) A Beta(m,n) variate may be generated as 



B(m,n) 



G^ (m,B) 

G'(m,B) + G”(n,B)’ 



( 1 . 6 ) 



where G'(m,B) and G"(n,B) are independent. Furthermore, the ratio B(m,n) is 
independent of the denominator G’(m,6) + G’’(n,B) = G(m'^n,B) and this 
property character i zes the Gamma random variable (Johnson and Kotz, 1970a). 

(ii) The Beta Gamma transf ormat ion . Multiplying a G(m+n,B) random 
variable by an independent B(m,n) random variable gives a G(m,B) random 
variable 



G(m,B) = B(m,n)G(m+n, 6) . (1.7) 

Thus one can reduce the shape parameter of a Gamma random variable (multiply 
by a Beta random variable), as well as increase it (add an independent Gamma 
variate, as at (1.5)). A heuristic argument for the result (1,7) is that if 
we wanted to perform the operation (1.7) on a computer, we could first 
generate G’(m,B) and G”(m,6) to form the ratio (1.6) to obtain the B(m,n) 
variate. There is, however, no need to generate G(m+n,B) in (1.7), we can 
use G’(m,6) + G”(n,6), which is independent of the Beta variate. 
Multiplication then gives G(m,6) = GMm,6). 



(iii) A formal proof of the Beta~Gamma transformation is a sperial 
case of the following Lemma, which will be used in the sequel. 

Lemma . Let X(k,6) be a Gamma random variable and let B(kp,kp) be a 
Beta random variable, which is independent of X(k,6), with p = 1 -p 
lying in (0,1). Then 



-(v-^Bu)X, 
E{ e } 



r B ^kp / B ^kp 
6+v B'*'V+u 



V > 0, u £: 0. 



( 1 .b) 



When V = 0, this result proves the Beta-Gamma transf ormat i on in the 

form 



X(kp,6) = B( k p : kp )X (k , 6) . 



( 1 . 9 ) 



Proof . Using (1.^) and conditioning on B, we get 






e 

B-^v+Bu 



E (-S- . 6 

[ 6-^v J 




e(bS , 



Where we have used the finiteness of the expectation to take the expectation 
inside of the binomial expansion. 

Now since E(B^) = B( k p + 51 , k p ) /B( k p , k p ) = f( k p + 2.) f( k p ) f( k ) / { f( k + 5L) f( k p ) f( k p ‘ ) 
we have that 



PJ-’{e(6L . ( 



k + 2,-1 



^(kp-^g-) r(kp)r(k) 

r(kU) r(kp)f(kp) 



r(k^^)r(kp^-e,)r(kp)r(k) 
r(k)r( n+ 1 ) r(kp) r(kp)r(k-*-d) 



/kp+i-1 
^ i 
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Thus 






£=0 



il=0 






kp 



kQ ( 'i “k p 



g ] > - 

6+v+u J [ 6+v 



and therefore 



-(v+Bu)X, ( 6 ikpf B ikp 

I ■ W* 



which was to be proved. 



2. THE FIRSTHDRDER AUTOREGRESSIVE PROCESS, BGAR( 1 ) 



2.1. Construction of the Process . 

Using the Beta-Gamma t r ansf ormat i on , we can construct a very simple 
first-order autogressive process {X^(k,6,p)} with Gamma(k,6) marginal 
distribution and a single parameter, p, that describes the dependency 
structure of the process. We have 

X (k,6,p) = B (kp,kp)X (k,6,p) + B'(kp,kp)G (k,6) (0 ^ p < 1) (2.1) 

n n n-1 n n 

= B^(kp,kp)X^_^ (k,6, p) Y^(kp,B) n=0,±1,..., (2.2) 

where (Y^(kp,B)} are i.i.d. Gamma(kp,B) variates independent of the 
{B^(kp,kp)} sequence. If X^_^(k,B,p) has a Gamma(k,B) marginal 
distribution, then multiplying by B^(kp,kp) reduces it to a Gamma(kp,B) 
variate and adding the innovation variable Y^(kp,B) creates the Gamma(k,B) 
variate, X^(k,B). The alternate form (2.1) shows the process as a 
transf ormat ion of an i.i.d. Gamma(k,B) sequence, but clearly generation on a 
computer would be done with (2.2). In the sequel, we will drop the 
parametric notation where no confusion is possible. 
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It is clear that taking to be a Gamma(k,6) variate will start the 
process in a stationary mode. Also, the process is Markovian by 
construction . 

2.2. Serial Correlation . 

It is easily established, using moments of Beta variables, that 

p(r) = Corr(X ,X ) = p^l, r = 0,±1,±2,--- . (2.3) 

n n-r 

Thus, in the three parameter process, the parameters k and 6 describe the 
marginal distribution of the process and p independently describes the 
correlation structure. Note that since the process is only defined for 
0 ^ p < 1 the correlations are non-negative. 

2 . 3 . Joint Laplace-Stieltjes transform . 

In the stationary process (X }, let L (u,v) denote the joint 

n A , A , 

n n-1 

Laplace-Stielties transform of the adjacent variables X and X Then we 

^ ^ ^ n n-1 

have 



4 ,X ■ E[exp|-X„u-X^_,vi] 

n n-1 

= E[exp(-B X -U-Y u-X v}] 
n n-1 n n-1 

~(v+Bu)X. f .. 

== E(e )E{ e ) , (2. 

where, in the last step, we have dropped the indices n and n-1 because of 
stationarity and have used the assumed independence of Y^ and fo write 

the expectation as the product of two expectations. 

Now the second term is evaluated in the Lemma of Section 1 and we have 
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S ,X 
n n-1 



r g B '|kpj' B ^kp 

^B+u"^ ^B+v"^ ^B+v+u'^ 



[• B , B ■jkpr B ■jkp 
6+u ^B+v+u 



(2.5) 



Since this transform is symmetric in u and v, the joint distribution of 

and is symmetric. Also, since the joint distribution of any set of 

X^^s can be obtained from (2.5) and the marginal Gamma distribution, the 

Beta-Gamma process is time-reversible. 

Note, too, that we have directly from the defining equation (2.2) that 

the regression of X on X , = x is linear: 

n n-1 

E(Xn|Xn-i=x) = px + (l-p)k/B = px + (1-p)p. (2.6) 



The time-reversibility of the process shows that 



E(Xn_i ) = py + ( 1 -p) P. 



(2.7) 



2 . . Convergence to a Gaussian AR(1) Process . 

1 /2 

As k gets large, the standardized Gamma(k,l) variate X’ = (X-k)/k 

converges to a standardized Gaussian variate. To prove this, consider the 

pair X^ and in the BGAR(1) process. The joint characteristic function 

for the standardized variables X* and X^ , is, using (2.5), 

n n- 1 

(p (s,t) = E{exp(-isX'-itX’ )} 

X , X , n n-1 

n n-1 . _ - _ - 

ik (s+t)f, . ,-1/23 ...-1/23 •.-’1/2, 

= e [l-^isk ] [1+itk ] {1+ik (s+t)l 

-1 /2 

Taking logarithms and expanding in powers of k gives 

1 /? — -1 /? ? 

^x, (s,t) = ik (s+t) - kpfisk “(is) /(2k)+0(k )} 

^ n-1 - -1 /2 2 - 3/2 

- kpfitk -(it) /(2k)+0(k )} 

- kp(ik ^ (s+ t )- ( i )^ ( s+ t )^/ ( 2k ) +0 ( k 
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and as k-^®, this converges to 



X’ " 

n' n-1 



~ -{ s^+t^+2pst} 



Thus, since the process is Markovian, the BGAR(l) process is r-qui valent to a 
Gaussian AR(1) process when k becomes large. 



2.5. Additivity and the GAR(1) Process . 

Gaver and Lewis (1980) showed that the usual linear stochastic 
difference equation 



will give a process (GAR(l)) with Gamma(k) marginal distributions if F.^ is 

the Gamma innovation variable GI (k,p) with Lapl ace~S t i el t j es transform 
k ^ 

{ ( 1 +ps)/ ( 1 +s) ) , where 0 ^ p < 1. This variable can be simulated by methods 

given by Lawrance (1982) and McKenzie (1986). 

Note that the GAR(l) process is a linear additive (constant 

coefficient) process and adding two independent GAR(l) processes with the 

same 6 and p values, say gives a new GAR(l) process 

with shape parameter and dependency parameter p. This is not true for 

the BGAR(1) process which is a random coefficient, linear additive process. 

The process obtained by addition is a process with Gamma marginals and 

Ir I 

correlation structure p(r) = p' but it is not even Markovian. Additional 

differences between the processes are that while the BGAR(l) process is 
t i me~rever s i bl e , the GAR( 1 ) process exhibits 'runs down'. In fact, it is 

degenerate in the sense that the innovation variable GI is zero with 

k ^ 

probability p . Thus, we get = p with probability p , and p can be 

estimated exactly in long enough series (Gaver and Lewis, 1980). This 

degenerate behavior is not exhibited by the BGAR(l) process. A method for 

combining two processes to obtain a broader process is described in Section 

5 . 



9 



2.6. The Conditional Density for X , Given X 
From the definition (2.2), we have that 



y • 



P{X S xlX = z} = P{B (kp,kp)z + Y (kp,6) ^ x} 
n ' n~l n n 

= P{Y^(kp,6) ^ X - B^(kp,kp)z}. 



Now, by definition, Y^(kp,6) is independent of X^_^ and of B^(kp,kp). Thus 
conditioning on B^ and differentiating with respect to x yields the 
conditional density for X^, given == z, as 




(x,y) 



r(k) 

r(kp) {r(kp)}" 



, kp -kx 
k e 



X > 0, y > 0 



X j w^^ ^(1-w)^^ \x-yw)e^^^dw, (2.10) 

0 



where 



L = 






1 if X ^ y, 

x/y if X < y, 



( 2 . 11 ) 



and, for simplicity, the scale parameter B has been set equal to one. 

This density is a continuous function of x and absolutely continuous 
except where x = y. Its utility is in obtaining maximum likelihood 
estimates of the parameters 6 (or p), k and p. This is discussed in the 
next subsection. 



2.7. Moment and Maximum Likelihood Estimates in BGAR( 1 ) . 

There are ’’natural” moment estimators for the three parameters in the 
BGAR(l) model, namely the mean, p, (or B) , the shape parameter, k, and the 
first-order serial correlation coefficient, p. From (2.3) and (1.3), these 
estimates are, from a sample (observed time series) of length n. 



( 2 . 1 . ; 



_ n 

M = X = I X./n, 
i = 1 

k = (X) , (2. 1 3) 

where is the sample variance, and 

n-1 _ _ 

p = I (X -X)(X -X)/l(n-1)S^}. (2.1^; 

i = l 

n 

V r* 

The variance of p is var(X)[l-*-2 I {l-(r/n)}p ]/n and nVar(u) is 

r = 1 

asymptotically equal to u ^ ( 1 + p ) / { ( 1 -p ) k } . More general properties of 

estimates R and p, however, are hard to derive. But, their distributions 

are independent of the scale parameter u. 

In Table 2.1, we give the simulated standard deviation and bias of tho 

estimates R and p for values of k = 0.25, 0.75, 1.0, 2.0 and p = 0.25, 

0.60, 0.90 for various values of n. Note that the values of n differ with 

. 1 /2 

p, since the "equivalent sample size", n’=n[(1'^p)/{{1-p)k}j , is 

different. Here n’ is the sample size which would be needed, for given p, 
to achieve the same variance for u as in the p=0 (independence) case. The 
simulation study was performed with the SUPER-SIMTBED program of Lewis, et 
al. (1985). 

Two other properties of the process may be useful in validating the 
BGAR( 1 ) model from data. 

The first is that difference of successive values in the time series, 

D = X “X , have an 5. -Laplace distribution (Dewald, 1985). This result 
n n+ 1 n 

comes f rom (2. ll) , 
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after converting to characteristic functions and setting v 
the characteristic function 



“U . We have 



= 4 ,X 

n n-1 



6 



6+iu B“iu 



kp r n 



6^+u- 



kp 



This is the characteristic function of an Jl-Laplace random variable with 
i = kp; the distribution is symmetric about zero. It goes to a Normal 
random variable as but for ^ 1 , the density function is not 

absolutely continuous at zero. In fact, for i ^ 1/2, the density is 
infinite at zero. The fact that the difference has median value 0, 
irrespective of the value of p or 6, can be useful in validating the model. 

Consider, now, ratios R == X , /X ; from (2.2) this is 

n n+1 n 



“ B (kp,kp) + Y (kp,6)/X . 
n n+1 n+1 n 

But Y^_^^(kp,6) and X^ are independent Gamma variates, so that, if k>l , 

E(R ) = E(B (kp,kp) + E{Y (kp,6)/X } 
n n+1 n+1 n 

= p + (kp/6)/((k-1)/6} = 1 + p/(k-l). (2.15) 

Higher moments can also be obtained. 

These results could be of use in validating the model. Another 
possibility for validating models is the higher-order residual analysis of 
Lawrance and Lewis (1986). 



CMR 



Joi nt -max i mum likelihood estimates for k and ^ (and perhap-. p- 

obtained from the conditional density (2.10) and the formula for the .joint 

density of X ,X is 

n n-1 1 



f (x ; X 
n n-1 



|X ^^^X |X 

n ' n-1 n-1 ' n-2 



( X , ; X , J - 
n-1 n-2 



t f ( y j 

x^ r * 






where f (x ) is a Gamma(k,l) density. 

Hugus (1982) used (2.16) to obtain j oi nt -max i mum likelihood e/.timates 
for k and p. Three cases are shown in Figure 2.1. Generally, when k is 
greater than 1, moment estimates of k and p are quite efficient, which 
agrees with results for independent Gamma(k,l) variates (Bartlett and 
Kendall, 19^46). However, when k is less than one, the maximum likelihood 
estimates become much more efficient than the moment estimators. 

The moment estimators, u, <, p, given at (2.12), (2.13) and serve 

as good starting points for numerical evaluations to find tho maximum 
likelihood estimates, u, k, p, of k and p. Techniques for the numerical 
integration of (2.10) are given in Hugus (1982). 

3. MOVING AVERAGE AND HIGHER ORDER AUTOREGRESSIVE STRUCTURES 

The Beta-Gamma t r ans f orma t i on can be used to generate dependency 
structures other than first-order aut or egr ess i ve structures for Gamma 
distributed time series. Several of these structures are given in this 
section . 

3.1. The First-Order Moving Average Process, BGMA(1) . 

The first-order moving average process, the BGMA(1), is constructed in 
essentially the same way as the BGAR(1) above. If IG^^} is a sequence of 

i.i.d. Gamma(k,6) random variables and (B^} is an independent sequence of 

i.i.d. Beta(ka,ka) random variables, where a = 1-u, wo define the (backward^ 
BGMA(l) process (X^} by 



15 



4 - 



4 * 

o 

4 - 



o + 




04 - 



4 - 



4 - 



o 



o 



-»■ 4 - 



O 



o 




82 0 



>Z 0 



02 0 



91 0 



d jdi9UJ0J0<j uoijoiajjoo |0 9 jdujhs3 




d j9i9UJ0J0cJ UOilOl^iJOO |0 3)OUJIlS3 




d J9)SUJ0J0<J UOliOldJJOQ |0 3>0UJI>S3 



Figure 2. 1 

Simulations of joint maximum likelihood estimates for k and p and joint moment 
estimates f or k and p, for three different sets of values of the parameters. 
Ten replications for each case. Top left figure: p=0.75, k=^.0; top right 
figure: p=0.25, k=4,0; bottom figure: p=0.75, k=0.75. The symbol (+) refers 
to estimates of p; the symbol (o) refers to estimates of k. Large symbols are 
joint estimates; small symbols are marginal estimates. 
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0 ^ a < 1 . 



(3.ly 



Evidently, (X^} is a stationary process and and independent 

if |r| >1. The marginal distribution of X^ may be derived by noting that 

the right-hand side of (3.1) is the sum of a G(k,6) random variable and an 

independent G(ka,6) random variable. Thus, X is a G(k(l-^a),6} random 

n 

variable. This process has the same structure as the usual Gaussian MA( 1 ) 
process, except that here the coefficient, B^, is a random variable rather 
than a constant. An immediate effect of this construction is that the 
observed and innovation processes, ^^n^* respectively, have 

different Gamma marginal distributions. This is in contrast to the 
structure of the EARMA processes ( Lawrance and Lewis, 1980), where it was 
deliberately arranged that they should have the same distribution. However, 
as we shall see shortly, this disparity in marginal structure has some 
advantages . 

From the viewpoint of modelling, it is more useful to determine the 

parameters of the innovation process in terms of those of the observed 

process. For this reason, we reparameter i ze (3.1) slightly. We consider 

(G ) to be i.i.d. Gamma(k/(Ma),6) r.v.s. and (B ) to be i.i.d. 
n n 

Beta(ka/( 1 +a) , ka/(l+a)} random variables. This yields an observed process 

(X ) which is Gamma (k, 6). 
n 

We may note that if we write p = Py( 1) = a/(1+a), then G is 

A n 

Gamma(kp,6), the same innovation process as for the BGAR(1) process. 

3.2. Autocorrelat i on Function for the BGMA(1) Process . 

The aut ocor r el at i on function for the moving average process may be 
determined directly. Thus, Cov(X^,X^_^) = Cov ( B^G^_ ^ , G^_ ^ ) = E(B)Var(G^, 
and so 



For |k| » 1, the attainable range of correlation is 0 ^ ^ which 

is the full possible range of positive correlation. This is because, for a 
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f irs t"order moving average, |p^(1)| = 0.5; see, e.g. Hugus (1982). The fact 
that the whole positive range is available is important, because it is in 
contrast to the EMA models (Lawrance and Lewis, 1977; 1980), where 
correlation is bounded above by 0.25. The greater flexibility in (3-2) is a 
result of the innovation and observed processes having different 
distributions. Since for k = 1, the Gamma distribution is an exponential 
distribution, the BGMA(l) process is then a broader first-order exponential 
moving average process than the EMA(l) process. 



3.3* Joint Distributions . 

The bivariate Laplace transform of derived by using 

(1.8). Thus, again, using the notation a=1-a, we have 

L(u,v) = E{exp(-uX -vX } = E{exp(“uG -uB .G -vG -vB G ,) 
n + 1 n n+1 n+1 n n n n-1 

= L^(u)E{L^(v+uB)}Lg^(v) 



6 )““( 6 )"“( 6 f’' 



B-^u J [ 6+v J [ 6+u+v 



(3*3) 



using the Lemma above. This has exactly the same form as the joint 

transform, (2.5), of ^or the BGAR(1) process with a corr espondi ng 

to p. This, too, corresponds to the behavior in Gaussian processes, where 

the joint distributions for the autor egr essi ve and the moving average 

processes have the same form and differ only in their autocorrelat ion 

functions. An immediate consequence is that the conditional distribution of 

X , given X , and X given X , , have exactly the same form as for the 

BGAR(1). We note the somewhat unusual result for a non-Gaussian process 

that regression is linear in both directions, even though the process is a 

moving average. In fact, E(X ,|X =x) = E(X |X , =x ) = ax+ka/6 = ax+aE(X). 

n+1 ' n n ' n+1 

The joint Laplace transform of any finite set of consecutive 

observations can be obtained by the procedure that yielded (3*3). Thus, the 

joint transform of (X ,X ,,•••, X ,) is given by 

n n-1 n-r+1 
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L(u^ .u^, • • • ,u^) 



6+u, 



ka r- 1 
X II 



l=c 






(b^u J 



r 

II 

i = : 



sKji 



II - — : 

1 1-1 



I V ^ , 



Note that this is not the r-'dimensional transform for the b'JAF'if 1 j pro^*^'S.j. 
Equality holds for only r = 2. 

One consequence of (3.^) is that, since L(u^,u.,-**,.^’ = 

L( u^ , u^_^ , • • • , u ^ ) , the process is time-reversible. 

The bivariate Gamma distribution whose transform is [^iv^^n at ( ^ i is 
well known (Johnson and Kotz, 1970b, p. ..^19) and is called by Ghirti:: 
the double Gamma distribution. Gince the multivariate form of' ^hic 
bivariate Gamma distribution arises as the individual sums of rn i ndei er. iurO 
Gamma variates with a common, independent Gamma variate, it is doubtful that 
triples, say ’ ^n* BGAR(1) process would have ih::- 

multivariate distribution. In fact, (3.^0 shows that this is not .so for tKm- 
moving average process. 

Another result that we can immediately derive from the Joint trcinsf'^.r'm 

is the distribution of the sum of n consecutive observations. This has a 

r 

particularly simple form for the BGMA(l) process. If T = 2 X. then 

1 = 1 

L^(u) = L( u ,u , • • • , u ) , which, from (3.^), is given by 

k (r-2(r-l )a} ^ . 

Further, since 0 $ a S 0.5, we can rewrite (3.5) in terms of rinO'.m 
variables as 



L.j,(u) = 



— 

[6+u 



k (r-1 )a 



(7.' 



T^ = G[ k { r-2 ( r“-1 ) a 1 , 6 ] 2G( k ( r~ 1 ) (^ , i , 

where the two Gamma random variables are independent. 

3 .^. Higher Order Moving Average Processes . 

Higher order moving average processes may be constructed by extending 
the BGMA(l) in an obvious way. Thus, the GBMA(q) is given by 
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G + 
n 



(3.6) 



X 



n 



q 

I 



i-1 



B . G . , 
n,i n-i 



q 



where (G } is a sequence of i.i.d. Gamma{k/ ( 1 + la.), 6} random variables and 

n ^ 1 

(B .} is an independent sequence of i.i.d. vector random variables with 
n , 1 

q q 



B i = 1,2,***,q, being a {ka . /( 1 +Za . ) , ka./(1+Za.)} random variable. In 

this case, {X^} is a stationary Gamma(k,6) process with X^, independent 

when |r| > q. The autocorrelation function is given by 



P,(r) 



a 

r 



q-1 

I 



j=i 



a .a . 



J J 




/ 



q 

a ) r=l ,2, . . . ,q 

j = 1 



0 



r > q, 



which, again, is a close analogue of the usual autocorrelation function for 
the Gaussian MA(q) process. The major difference is that all the 
correlations are non-negati ve . 



THE MIXED AUTOREGRESSIVE, MOVING AVERAGE PROCESS, BGARMA(1,1) 



A more complicated dependence structure in Gamma distributed variables 
that is the analog of the usual linear ARMA(p,q)-type process is now given. 

^.1. Structure of the Mixed Process, BGARMA(1,1) . 

We can construct an ARMA-type process with a Gamma marginal distribution 
by combining the two first-order processes we have discussed above. For 
convenience, we write each in a slightly different form. The moving average 
component is given by 



X 

n 



n-1 



B G , 
n n 



(^. 1 ) 



where tB^} are as given in Section 3.1 above, i.e., independent 

sequences of i.i.d. Gamma { k / ( 1 +a ) , 6 } random variabes and i.i.d. 
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Beta{ka/( 1 +a) , ka/(1+a)l random variables, respectively. Notice that B is 

n 

the coefficient of G in (^.1), whereas it was associated with G , in 

n n~ 

(3*1). Clearly, this change will make no distributional difference and, as 
we shall see, renders the parameters of the ARMA model more readily 
interpretable. The sequence {Y^} is generated from a BGAR(1) process given 
by 



Y 

n 



A Y . 
n n~1 



+ A’G . 
n n 



i^.2) 



The process is as above and {A^} and (AM are independent sequences of 

i.i.d. Beta{kp/{ 1 +a) ,kp/ ( 1 +a) } and i.i.d. Be t a ( k p/ ( 1 + a ) , k p / ( M m ) ) random 

variables, respectively. If Y is G{ { k/ ( 1 +a) , B } , then so also is Y and 

n~l n 

the required stationary process { results. , The product is, of 

course, the Gamma { k p/ ( 1 ) , 6 } random variable used as the innovation process 
in Section 2.1 above, but is written in this form here to make explicit the 
dependence on the innovation sequence G^. 



I|.2. The Autocorrelat ion Function of the BGARKA(1,1) Process . 

may be derived in the usual way, 



The aut ocorr el at i on function of {X 

n 

Thus, Cov(X ,X ) = Cov(Y ,,Y ,) + Cov(Y 

n n-r n-1 n-r-1 

Now, Cov(Y , ,Y ,) = p'"var(Y), and Cov(Y ,,B G ] 
, n-l n-r-1 n~l n~r n-r 

— r “ 1 

app Var(G), so that we obtain 



, ,B G ; 
n-l n-r n~r 



=T.a 



ap r-1 
P 



r = 1 , 2, 



(^. 3 ) 



This is the form of the aut ocorrelat i on function of the ARMA(1,1) process. 
Note, too, that the choice of the slightly different structure for (^.1) and 
(^.2) has endowed the two parameters (p,a) with physical s i gn i f i cance . 
Choosing a=0 effectively sets B^ to zero, so that the process is now simply 
the BGAR(l) process, and we can see f rom (^.3) that becomes p^, as 
expected. If we choose p=0, then Y becomes G and (X } is the BGY!A(l) 
process, and (^.3) reduces to a/(l+a), as it should. The reduction of the 
mixed model to its simpler forms when the parameters vanish is a consequence 
of the structure we have chosen. 
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^.3. Joint Distributions in the BGARMA(1,1) Process. 



The joint distribution of two consecutive observations of the process 
(X^} can be derived in the form of the corr espondi ng Laplace transform. 
Thus , 



L(u,v) = E(exp(-uX ~vX ) 
n+1 n 

= E[exp(-u(A Y +A»G ) - uB 0 - vY - vB G } ] 

^ n n“1 n n n+1 n+1 n~1 n n 

= Lg^(u).E(L^(v+uA)}.E{L^(uA’ + vB)}. 

The first two terms of this product have already been evaluated and we now 
consider the third. By considering appropriate series expansions, we can 
evaluate the third term in the form 



00 00 



I I 



( ~u)^ (~v)^ r( gQ+ n ) r( pQ-^m) r( 0^m-*-n) r( Q) 



m ! 



n! 



m=0 n=0 ““ r(a0)r(p0)r(0+n)r(Q+m) 

where 0 = k/(l+a). Hence, we can show that L(u,v) is given by 

1 0(g+p) 



1 


0p 


1 


(1 +u) (1 +v) 




1 +U+V 



2 1 



F,(0p;eo.e,:f^), 



(4,5) 



where is the Hypergeometric function, defined by 



F (a b c z) = y i^^a ^n)r(b^ .nm_ci . ^ 
2 r(a)r(b)r(c+n) nl 



n 



The behavior and properties of this function are detailed in Abramowitz and 
Stegun (196^, Ch. 15). It is easily verified that when p=0 or a=0, the 
appropriate forms of L(u,v) result from (^.5). The transform corr espondi ng 
to higher-dimensional distributions are more difficult to obtain in closed 
form, although, in principle, series expansions of the form of (^.^) can be 
derived . 
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Further, the symmetry of L(u,v) in u and v implies that the conditional 
distributions of X given X and X given X , are identical. In 
particular, we can recover the conditional moments from i^.b) and it is 
found that regression is linear and the conditional variance is quadratic in 

X . 

n 



Higher Order Mixed Processes . 

Higher order BGARMA processes can be derived by suitable extensions of 
the BGARMA(1,1). In particular, it is strai ght f orwar d to construct a 
BGARMA(l,q) process by replacing by an MA(q) form as in (3.6). Thus, 

q-1 

X = I B .G . + Y 
n . ^ n-i n-i n~q 

replaces (^.1) and (^.2) is as before. The more general problem of 
extending to higher order AR forms is more difficult. One way of achieving 
it, however, is to use mixtures (random indexing). For details, the reader 
is referred to Lewis (1985). 

5. DUAL AND ITERATED GAMMA PROCESSES 

The first-order autor egr essi ve Beta-Gamma processes given in equation 
(2.1) has been shown to be t i me-r e ver s i bl e . This can be a handicap in 
modelling phenomena such as water run-offs, which tend to have 'runs down' 
in their sample paths. This is modelled, as noted, in a defective way by 
the GAR(l) process of Gaver and Lewis (1980). We, therefore, look for other 
Gamma processes, possibly with more than one parameter to model dependency 
structure, which broaden the BGAR(l) process. 

The first process to consider is the dual of the BGAR(l). The duality 
refers to the fact that where in (2.1) we decrease the shape parameter, k, 
of by using the Beta-Gamma transform and then bring it up to k by 
adding an independent Gamma variable, we now increase k in X^_^ by adding an 
independent Gamma variate and then decrease the parameter to k by using the 
Beta-Gamma transform. Thus, we have 
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X^(k,B) = B(k,q){X^_^ (k,6) + G^(q,B)}. 



(5.1) 



However, it can be shown that the joint transform of X and X , is 

- - -k ^ 

(1+u) ^(1+v) ^(1+u+v)^ ^ Q J J ^ ^ » so that the process is 

t ime-reversible . We thus have nothing new by way of broadening the BGAR(1) 

process . 

Another approach to broadening the BGAR(l) structure is to iterate the 
process. Thus, in (2.1), the left hand side is a Gamma(k,B) random variable 
and the procedure in (2.1) can be reapplied. However, a t i me-r ever s i bl e 
process is again obtained. A better way to iterate is to apply the GAR(1) 
procedure to (2.1) and obtain a combination of the BGAR(l) and GAR(1) 
processes: 

X (k,B) = Y{B’(kp,kp)X . + Y (kp,B)} ^ G1 (kY,B) (5.2) 

n n n~l n n 

= YB’(kp,kp)X , + YY (kp,B) + GI (kY,B), (5.3) 

n n~1 n n 

where O^YSI, O^p^l, p='Y^1,k>0 and {Gl^(kY,B)} is a sequence 
of i.i.d. Gamma innovation random variables with Laplace-Stiel tjes transform 
( ( B ^Ys ) / ( B + s ) } ^ , independent of {B^} and (Y^(kp,B)}. The condition that p 
and Y do not both equal one is necessary to obtain an ergodic process. Note 
that (5.2) is different from the combination given in Lawrance and Lewis 
(1982) and we denote it by GBGAR(l). 

Now in (5.2), the case Y = 1 gives the BGAR(1) process, Y=0 and/or p=0 
gives an i.i.d. sequence {X^} while p=l gives the GAR(1) process. Thus, we 
should find sample path behavior running from time-reversibil ity to ’runs 
down’ behavior. Also, the process is Markovian and has serial correlation 

p(r) = Orp) kl r = 0,±1 ,±2, (5.^) 

For the joint Laplace-Stielt jes transform of X and X 

^ n n-1 



2 ^ 



, we have 



E{ exp(-uX -vX , ) } 
n n-i 

E[exp(~uYB’ (kp,kp)X - uYY (kp,B) - uGI (kY,6) - vX J] 
n n-i n n n-1 

B . B B 

B+u ) (b+uY B+vJ (b+v+uY 

Thus, it can be seen that the process is not time-reversible unless Y=1 (the 

BGAR(l) case). The regression of X on X , = x is 

n n-1 

E(Xn|Xn_i=x) = p(1)x + { 1 -p( 1 ) } E( X ) , 

which is linear in x, but it is not the same as E(X ,|X ==x ) . 

n-1 ' n 

To separately identify the parameters Y and p in this process, one must 
go beyond second-order properties of the process. This is because the 
parameters enter into the correlation (5.4) as a product. Higher order 
residual analyses (Lawrance and Lewis, 1986) and maximum likelihood 
estimation will be considered elsewhere. 





4 ,x ■ 

n n-1 



6. NEGATIVE CORRELATION AND NON-LINEAR PROCESSES 



All of the processes described above are limited by their serial 
correlations being non-negative. There are a number of ways of extending 
the processes to give negative valued serial correlations and we discuss one 
of them in some detail. All methods involve non-linear functions of, say, 
X^_^ in a first-order autoregressi ve process. This is necessary because of 
the non-negativity and lack of symmetry of Gamma disributed variates. 

6.1. Anti thetic Variates . 

Let X be a continuous random variable with c.d.f. F (x) and inverse 

- 1 ^ - 1 
c.d.f. (a), 0 < a <1. Then the random variable X^ = F^ (1“F^(X)} is 

called the antithetic variable to X. For symmetric two-sided random 

variables centered at zero, X^ = -X. For positive valued variables such as 

Gammas, X^ has the maximum attainable negative correlation for bivariate 
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Gamma pairs (Moran, 1967). In particular, if k=l (Exponential), 

— X 

X* = -£n(l-e ), but if the transf ormat ion is difficult to compute. 

If in (2.1) is replaced by in (2.1), then a very non-linear, 

Markovian first-order au t or egr ess i ve process is obtained. Serial 
correlations beyond lag one are difficult to compute. 

6.2. Coupl ing . 

Gaver and Lewis (1980) introduced a scheme in the context of the GAR(l) 
processes for cross-coupling two Gamma processes so that the marginal 
processes will have negative serial corr elations . It is actually easier to 
implement this scheme for the BGAR(l) process than for the GAR(l) process, 
because the random. Beta distributed coefficients are continuous. We do not 
pursue this here. 

6.2. Inverse Processes . 

A direct scheme for obtaining negative correlation in a Gamma process is 
now given. It is a generalization of a scheme given by Lewis (1983) to 
generate negatively correlated bivariate Gamma pairs. Its utility lies in 
the fact that the sequence can be generated with nothing but i.i.d. Gamma 
variates, no numerical inversions of inverse distribution functions are 
required . 

Thus, let B^(k;q-k), for q > k, be a sequence of independent 
Beta(k;q-k) variates, independent of ^^id G^(k+q), which are 

independent sequences of independent Gamma variates, n=l,2,***. Also, let 
X^(k) be a Gamma(k) variate, where k > 0. The idea is that we want ^^(k) to 
be small when X^_^(k) is large, while retaining the Gamma(k) marginal 
structure of the process. We have 

C;(q) 

X^(k) = B^(k,q-k) (k)^G'(Q) q S k, n = (6.1) 

n-1 ^ n^ ^ 

Note that the ratio is Beta(q;k) and, by the Beta-Gamma transform 
(1.7), the product of this ratio with the independent Gamma(k+q) variable is 
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a Gamma(q) variate. The multiplier B^(k;q“k) reduces the shape parameter 
f rom q to k , 

To obtain p( 1 ) , the correlation between X^_^^(k), we need 



E{X (k)X , (k)} = E{B (k;q-k)}E 
n n-i n 



C;(q)X^_l(k) 



X^_l(k) ^ c;(q) 



E{G"(k^q)l 



k (k -^q ) 

q B 



0'(q)X„.,(k) 



Xr,_,(k) . o;(q) 



( 6 . 2 ) 



To evaluate the remaining expectation in (6.2), we use the fact that in an 
expression such as X^_^^ (k )/ { (k ) + G^(q)}, the denominator is independent 
of the ratio. Then we have, after some manipulation, that 



E 



G’(q)X^_^^ (k) 



Kg 

6(k +q+1 ) ■ 



(6.3) 



Combining (6.2) and (6.3). we get the suprisingly simple result that 



Corr(Xn,Xn_i) = P(l) = - q>k. (6.M 

This correlation is always negative and if k = 1 (the Exponential case), 
p(l) has a minimum attainable value of ~1 /3 when q = k. This is about 
halfway to the minimum attainable correlation for bivariate Exponential 
variables of -0 . 6l , 

The scheme can be iterated to achieve greater negative serial 
correlation. In this and the scheme (6,1), serial correlations of higher 
order are difficult to obtain. However, since the process is Markovian, the 
decay of the absolute values of the serial correlations is geometrically 
bounded . 
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7. AN ANALYSIS OF TIMES BETWEEN FAILURES OF A COMPUTER SYSTEM 



Hugus (1982) used the Beta-Gamma process to analyze a long sequence of 

wind speeds. This sequence is very non-s tat i onar y , containing yearly 

cycles. The model actually used is p(n)G*, where p(n) is a log-linear 

function of n and G* is a unit-mean BGAR(l) process. 

n 

A simpler, stationary series of times between failures of a computer 

system is analyzed here. Although this data is known to be generated by a 

branching Poisson process (Lewis, 1964), the Gamma model is much simpler and 

much more tractable than the branching Poisson process and provides an 

adequate r epr esentat ion for most purposes. Modelling of these times between 

failures is important because, for example, they represent times at which 

requests for service to the computer are made. 

In Figure 7.1, we give a Gamma probability plot for the 256 times 

between failures. The fit appears adequate, but the goodness of fit 

statistics in the table in Figure 7.1 must be used with caution, since the 

data is serially correlated and two parameters have been estimated from the 

data. The parameter k (ALPHA in the table) is estimated as k = 0.704. Note 

the two outliers in the Gamma probability plot. Since these are serially 

adjacent, they probably represent a lapse in recording of computer failures. 

In Figure 7.2, we show a correlation analysis of the data. The decay in 

correlation from p(1) = 0.353 to higher lags is consistent with first-order 

autor egr essi ve correlation structure. The bands in the figure are 

approximate confidence intervals for each Mk) under the assumption the true 

correlation is zero for lag greater than k. (See Box and Jenkins, 1976, p. 

35 for details). Clearly the times between failures are correlated and thus 

a renewal model, say, for these times between failures would be inadequate. 

The partial autocorrelat ion plot in Figure 7.2 also confirms the first- 

order autoregressi ve nature of the data. The last panel in Figure 7.2 shows 

the autocorrelat ion function for the estimated residuals, R =X -p(l)X 

n n n-1 

there is no significant correlation in this series. 

Finally, in Figure 7.3, we give empirical density functions (kernel 

density estimates) for the successive differences, X -X ,, n = 2,3,*** and 

n n-1 

successive ratios X /X , n = 2,3***, which were discussed in Section 2.7. 

n n-1 



28 



The differences show a highly symmetric and long-tailed density function 

which is consistent with an ^.-Laplace distribution. Note that the median of 

the differences is estimated to be -9. Given the range of the differences, 

this is probably not significantly different from the value of zero which 

would hold for the time-reversible Beta-Gamma process. The ratios, X /X ,, 

^ n n-1 

have an estimated mean of 10.812 with estimated standard deviation of 
1 /2 

^2.^32/(255) = 2.66. Thus, there is nothing here to suggest any 

inadequacy in the Beta-Gamma model for charact er i zi ng the data. 
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Figure 7. 1 



Gamma probability plot for times 
between failures of a computer system 
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Autocorrelation structure of times between 
failures of a computer system 



I I 

t 0- 



I I 

t 0 - 



31 



Logs LACS 



EMPIRICAL DENSITY TABLE 



H OO^ODO<Nr^<NL. 

• o CN C 7 » 

■<_J<l 00 <N • • • *1 

0:<Q:fN— -rOmOGOOi 



S CO 
. iO 
O (N 

— 

<1 o 



5_ 

►- u. 
u m O 

^5 '* 

Ui O * 
r trt X ? 1 



5 2 
3 5 



15 

ida ;££ 

. o: ui < ui uj : 
> ^ CN a <7> 



2 2 

> lO (A 2 L 3 UJ u 

_ UJVl*-UiO OO 

uju^ouju)ua:^£K 



2 2 
X X 



-j 






fcl 



(N r\ 

o* irt 



a «o oo u> 

Uj O 0> C7k O 

u. u. r>s . »n o ^ «J 

a> o 

‘CDOrcor^otrco 
£^<0r4r40i — I I I (MO 



o »o 

rs? 



> trt 2 LJ U U 

UJ trt — tJ V O 
Oum/iooc2££ . • 




F igure 7 • 3 
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